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Extraction-Transformation-Loading (ETL) tools are pieces of software responsible for the 
extraction of data from several sources, their cleansing, customization and insertion into a 
data warehouse. In this paper, we focus on the problem of the definition of ETL activities 
and provide formal foundations for their conceptual representation. The proposed 
conceptual model is (a) customized for the tracing of inter-attribute relationships and the 
respective ETL activities in the early stages of a dat ... 
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In multidimensional data models intended for online analytic processing (OLAP), data are 
viewed as points in a multidimensional space. Each dimension has structure, described by 
a directed graph of categories, a set of members for each category, and a child/parent 
relation between members. An important application of this structure is to use it to infer 
summarizability, that is, whether an aggregate view defined for some category can be 
correctly derived from a set of precomputed views defined f ... 
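The abstract above is truncated, so the paper's own conditions are not visible here. As a minimal sketch of what a summarizability failure looks like, assume a hypothetical City/State/Country dimension in which one city has no state parent. All names and figures below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical child/parent relations between members of three categories.
city_to_state = {"Austin": "Texas", "Dallas": "Texas"}  # "Washington" has no state
state_to_country = {"Texas": "USA"}
city_to_country = {"Austin": "USA", "Dallas": "USA", "Washington": "USA"}

# Hypothetical precomputed aggregate view at the City category.
sales_by_city = {"Austin": 10, "Dallas": 20, "Washington": 5}


def roll_up(totals, parent_of):
    """Aggregate totals to the parent category; members without a parent are lost."""
    out = defaultdict(int)
    for member, value in totals.items():
        if member in parent_of:
            out[parent_of[member]] += value
    return dict(out)


by_state = roll_up(sales_by_city, city_to_state)
country_via_state = roll_up(by_state, state_to_country)
country_direct = roll_up(sales_by_city, city_to_country)

# The Country view is derivable from the State view only if both paths agree.
summarizable = country_via_state == country_direct
```

Here the Country total derived from the State view misses the city that skips the State category, so the State view alone cannot be used to answer the Country query.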
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As more and more information is available on the web, it is a problem that many web 
resources are not accessible, i.e., are not usable for users with special needs. For 
example, for a web page to be accessible, it should give text alternatives (i.e., 
explanatory texts) for images such that blind users that have the web pages read aloud 
automatically also can obtain information about the images. In the European Internet 
Accessibility Observatory (EIAO) project, a crawler that will evaluate the ac ... 
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The Fourth Annual ACM International Workshop on Data Warehousing and Online 
Analytical Processing (DOLAP 2001) was held in Atlanta, GA, USA, in November 2001, in 
conjunction with the Tenth International Conference on Information and Knowledge 
Management (CIKM 2001). Although this was only the fourth annual meeting, DOLAP has 
already become an important and broadly accepted forum for researchers and 
practitioners to share their findings in theoretical foundations, current methodologies, 
practical ... 
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Software design is a complex activity. A successful designer requires knowledge and 
training in specific design techniques combined with practical experience. Designing a 
dimensional model embodies this challenge. This paper presents Dimensional Design 
Patterns (DDPs) and their applications to the design of dimensional models. We describe a 
metamodel of the DDPs and show their integration into Kimball's dimensional modeling 
design process so they can be applied to design problems using a known p ... 
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Choosing the appropriate modeling approach is often the critical factor in the success or 
failure of a data warehousing implementation. 
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A multidimensional database stores data as groups of field category values into 
dimensions, and then groups these dimensions into multidimensional arrays. Specific field 
category values that may occur in data identify either the rows or columns of array 
dimensions. The specific grouped field categories themselves identify the row or column 
array dimensions. This view, when presented to the end user, brings in more relevance 
and business sense for practical decision making than the views presented ... 
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Using a common set of attributes to determine which methodology to use in a particular 
data warehousing project. 
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review 

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A peer-to-peer (P2P) data management system consists essentially in a network of peer 
systems, each maintaining full autonomy over its own data resources. Data exchange 
between peers occurs when one of them, in the role of a local peer, needs data available 
in other nodes, denoted the acquaintances of the local peer. No global schema is assumed 
to exist for any data under this computing paradigm. Henceforth, data provided by an 
acquaintance of a local peer must be adapted in a manner that an ... 
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This paper describes the Phoenix system, which loads a data warehouse and then reports 
against it. Between the raw atomic data of the source system and the business measures 
presented to users there are many computing environments. Aggregation occurs 
everywhere: initial bucketing by the natural keys on the mainframe, loading the fact table 
using a mapping table, maintaining aggregate tables and reporting tables in the data 
base, in the GUI, in SQL queries issued on behalf of client tools by ... 
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Query processing: implementing operations to navigate semantic star schemas 
Alberto Abello, Jose Samos, Felix Saltor 

November 2003 Proceedings of the 6th ACM international workshop on Data 
warehousing and OLAP DOLAP '03 

Publisher: ACM Press 

Full text available: pdf(193.82 KB) Additional Information: full citation, abstract, references, citings, index terms 

In the last years, lots of work have been devoted to multidimensional modeling, star 
shape schemas and OLAP operations. However, "drill-across" has not captured as much 
attention as other operations. This operation allows to change the subject of analysis 
keeping the same analysis space we were using to analyze another subject. It is assumed 
that this can be done if both subjects share exactly the same analysis dimensions. In this 
paper, besides the implementation of an algebraic set of operatio ... 
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The paper explores the manner in which an organization's data and information can be 
effectively utilized to assist an organization to achieve its business objectives. With the 
increased popularity of data warehousing and executive information systems, there is 
renewed interest by IT practitioners in data models and database structures, in particular 
multi-dimensional forms, which have joined their relational counterparts as legitimate 
tools for extracting vital business information from an orga ... 
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In recent years the construction of large scale data schemes for operational systems has 
been the major problem of conceptual data modeling for business needs. Multidimensional 
data structures used for decision support applications in data warehouses have rather 
different requirements to data modeling techniques. In case of operational systems the 
data models are created from application specific requirements. The data models in data 
warehouses base on the analytical requirements of the use ... 

Keywords: conceptual data model, data warehouse, decision support system, entity 
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Using materialized view to accelerate OLAP queries is one of the most common methods 
used in ROLAP systems. However, high storage and computation cost make this method 
very difficult to be implemented in the actual environment. Among various issues 
associated with this, index selection and view materialization are two of the top 
challenges. In this paper, we propose to build indexes on subsets of the primary keys 
rather than the full sets if the index selectivity for these smaller indexes can be ... 
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The Relational On-Line Analytical Processing (ROLAP) is emerging as the dominant 
approach in data warehousing with decision support applications. In order to enhance 
query performance, the ROLAP approach relies on selecting and materializing in summary 
tables appropriate subsets of aggregate views which are then engaged in speeding up 
OLAP queries. However, a straight forward relational storage implementation of 
materialized ROLAP views is immensely wasteful on storage and incredibly inadeq ... 
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Publisher: ACM Press 

Full text available: pdf(216.79 KB) Additional Information: full citation, abstract, references, index terms 

Multidimensional aggregation queries constitute the single most important class of queries 
for data warehousing applications and decision support systems. The bottleneck in the 
evaluation of these queries is the join of the usually huge fact table with the restricted 
dimension tables (star-join). Recently, a multidimensional hierarchical clustering schema 
for star schemas is suggested. Subsequently, query evaluation plans for multidimensional 
queries appeared that essentially implement a ... 
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On-line analytical processing (OLAP) is a technology that encompasses applications 
requiring a multidimensional and hierarchical view of data. OLAP applications often 
require fast response time to complex grouping/aggregation queries on enormous 
quantities of data. Commercial relational database management systems use mainly 
multiple one-dimensional indexes to process OLAP queries that restrict multiple 
dimensions. However, in many cases, multidimensional access methods outperform one- 
dimensiona ... 
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Using common subexpressions to speed up a set of queries is a well known and long 
studied problem. However, due to the isolation requirement, operating a database in the 
classic transactional way does not offer many applications to exploit the benefits of 
simultaneously computing a set of queries. In the opposite, many applications can be 
identified in the context of data warehousing, e.g., optimizing the incremental 
maintenance process of multiple dependent materialized views or the generation ... 
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Data warehouses contain large amounts of information, often collected from a variety of 
independent sources. Decision-support functions in a warehouse, such as on-line 
analytical processing (OLAP), involve hundreds of complex aggregate queries over large 
volumes of data. It is not feasible to compute these queries by scanning the data sets 
each time. Warehouse applications therefore build a large number of summary tables, or 
materialized aggregate views, to ... 
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The DWS (Data Warehouse Striping) technique allows the distribution of large data 
warehouses through a cluster of computers. The data partitioning approach partitions the 
fact tables through all nodes and replicates the dimension tables. The replication of the 
dimension tables creates a limitation to the applicability of the DWS technique to data 
warehouses with big dimensions. This paper proposes a strategy to handle large 
dimensions in a distributed DWS system and evaluates the proposed str ... 
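The partition-and-replicate idea described in this abstract can be sketched as follows. This is an illustrative sketch only: the round-robin row assignment, the function names, and the sample data are assumptions, since the abstract does not say how rows are assigned to nodes.

```python
from collections import defaultdict


def stripe_facts(fact_rows, n_nodes):
    """Spread fact rows over nodes (round-robin is an assumption of this sketch)."""
    nodes = [[] for _ in range(n_nodes)]
    for i, row in enumerate(fact_rows):
        nodes[i % n_nodes].append(row)
    return nodes


def local_sum_by_category(fact_partition, product_dim):
    """Run the aggregate on one node, using its full replica of the dimension."""
    partial = defaultdict(int)
    for product_id, amount in fact_partition:
        partial[product_dim[product_id]] += amount
    return partial


def merge_partials(partials):
    """Combine the per-node partial sums into the final answer."""
    merged = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            merged[key] += value
    return dict(merged)


# Hypothetical dimension table (replicated on every node) and fact rows.
product_dim = {1: "books", 2: "music", 3: "books"}
facts = [(1, 10), (2, 5), (3, 7), (1, 3), (2, 8)]

nodes = stripe_facts(facts, 2)
result = merge_partials(local_sum_by_category(part, product_dim) for part in nodes)
```

Because every node holds the whole dimension table, each join stays local. That replication is also the cost the abstract points to when dimensions are large.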
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This paper proposes a novel way of automatically developing data warehouse 
configuration in rule-based CRM systems. Rule-based CRM systems assume that 
marketing activities are represented as a set of IF-WHEN rules. Currently, to provide 
good quality CRM functionalities, CRM systems seek to combine conventional CRM 
methodologies with data warehousing technology. A data warehouse can be abstractly 
seen as a set of materialized views. Selecting views for materialization in a data 
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A star schema is very popular for modeling data warehouses and data marts. Therefore, it 
is important that a database system which is used for implementing such a data 
warehouse or data mart is able to efficiently handle operations on such a schema. In this 
paper we will describe how one of these operations, the join operation, probably the 
most important operation, is implemented in the IBM Informix Extended Parallel Server 
(XPS). 
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TPC-DS is a new decision support benchmark currently under development by the 
Transaction Processing Performance Council (TPC). This paper provides a brief overview of 
the new benchmark. The benchmark models the decision support functions of a retail 
product supplier, including data loading, multiple types of queries and data maintenance. 
The database consists of multiple snowflake schemas with shared dimension tables; data 
is skewed; and the query set is large. Overall, the benchmark is conside ... 

Keywords: TPC, benchmark, data warehouse, decision support, performance evaluation 

9 Dynamic maintenance of multidimensional range data partitioning for parallel data processing 

Junping Sun, William I. Grosky 

November 1998 Proceedings of the 1st ACM international workshop on Data 
warehousing and OLAP DOLAP '98 

Publisher: ACM Press 

Full text available: pdf(1.09 MB) Additional Information: full citation, references, citings, index terms 



Dealing with slow-evolving fact: a case study on inventory data warehousing 
Chung-Min Chen, Munir Cochinwala, Elsa Yueh 

November 1999 Proceedings of the 2nd ACM international workshop on Data 
warehousing and OLAP DOLAP '99 

Publisher: ACM Press 

Full text available: pdf(941.25 KB) Additional Information: full citation, abstract, references, index terms 






Data Warehousing for INventory management (DWIN) is a production project at Telcordia 
aimed at providing telecommunications service providers with decision support functions 
for inventory control and monitoring. In this paper, we report some interesting issues 
related to the design of the data warehouse. Specifically, we will discuss the issues of 
slow-evolving fact, transaction-oriented fact table, and large dimensions. We also propose 
the concept of virtual data cubes and ... 
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User queries have been becoming increasingly complex (e.g., involving a large number of 
joins) as database technology is applied to some application domains such as data 
warehouses and life sciences. Query optimizers in existing database management 
systems often suffer from intolerably long optimization time and/or poor optimization 
results when optimizing large join queries. One possible solution to tackle these problems 
is to rewrite a user-specified complex query into another form that can be ... 

Keywords: complex query, database management system, query graph, query 
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Some significant progress related to multidimensional data analysis has been achieved in 
the past few years, including the design of fast algorithms for computing datacubes, 
selecting some precomputed group-bys to materialize, and designing efficient storage 
structures for multidimensional data. However, little work has been carried out on 
multidimensional query optimization issues. Particularly the response time (or evaluation 
cost) for answering several related dimensional queries simultaneous ... 
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The read-mostly environment of data warehousing makes it possible to use more complex 
indexes to speed up queries than in situations where concurrent updates are present. The 
current paper presents a short review of current indexing technology, including row-set 
representation by Bitmaps, and then introduces two approaches we call Bit-Sliced 
indexing and Projection indexing. A Projection index materializes all values of a column in 
RID order, and a Bit-Sliced index essentially takes an orth ... 
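The abstract is cut off before it finishes describing the Bit-Sliced index, so the following is only a sketch of the commonly described construction, not necessarily the paper's exact design. It assumes non-negative integer column values and uses Python integers as bitmaps; the function names and sample data are hypothetical.

```python
from typing import List


def build_projection_index(column: List[int]) -> List[int]:
    """Projection index: the column's values stored in row-id (RID) order."""
    return list(column)


def build_bit_sliced_index(column: List[int]) -> List[int]:
    """Return one bitmap per bit position.

    Bitmap i has bit r set when bit i of the value in row r is 1.
    """
    width = max(column, default=0).bit_length()
    slices = [0] * width
    for rid, value in enumerate(column):
        if value < 0:
            raise ValueError("this sketch handles non-negative integers only")
        for i in range(width):
            if (value >> i) & 1:
                slices[i] |= 1 << rid
    return slices


def sum_with_bit_slices(slices: List[int], found_set: int) -> int:
    """Sum the column over the rows whose bit is set in found_set."""
    return sum(
        (1 << i) * bin(bitmap & found_set).count("1")
        for i, bitmap in enumerate(slices)
    )


sales = [5, 3, 12, 7]          # hypothetical column, one value per row
slices = build_bit_sliced_index(sales)
found = 0b1011                 # rows 0, 1 and 3 selected by some predicate
total = sum_with_bit_slices(slices, found)
```

The sum is computed from bitmap intersections and bit counts alone, without reading the individual column values, which is the kind of aggregate such an index is meant to speed up.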
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Data warehousing and on-line analytical processing (OLAP) are essential elements of 
decision support, which has increasingly become a focus of the database industry. Many 
commercial products and services are now available, and all of the principal database 
management system vendors now have offerings in these areas. Decision support places 
some rather different requirements on database technology compared to traditional on- 
line transaction processing applications. This paper provides an overview ... 
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Accurate estimation of the sizes of intermediate query results (cardinality estimation) is of 
critical importance to plan costing in query optimization. The common practice in current 
commercial database systems such as IBM DB2 Universal Database (DB2 UDB) is to 
derive the cardinality estimates from base-table statistics. However, this approach often 
suffers from simplifying yet unrealistic assumptions that have to be made about the 
underlying data (for example, different attributes are independ ... 
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A database server used for implementing a data warehouse must support other features 
than a database server used for OLTP. Therefore, in this paper we will look specifically at 
features necessary for efficiently processing queries on a database with a star schema 
model, a database scheme which is used very often in data warehousing. We will 
especially analyze the features provided for this by the IBM Extended Parallel Server 
(XPS). There are special star join methods like the Push-Down Hash Semi ... 
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